1 Preliminaries

1.1 Software dependencies

This document does not install the required R packages by default. You can run the script install.R to install all required dependencies on a new R installation, or use install.packages(..) to install missing R packages.
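As a minimal sketch of the second option, the following base R snippet finds which of the packages listed in Section 1.2 are not yet installed. The helper `missing_packages()` is hypothetical and is not part of install.R; the actual installation call is left commented out because it needs network access.

```r
# Hypothetical helper (not part of install.R): return the packages from
# `pkgs` that are not yet installed in the current R library paths.
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

# Package list copied from Section 1.2 of this document.
required <- c("tidyverse", "forcats", "kableExtra", "here", "stringr",
              "googlesheets4", "likert", "magick", "patchwork",
              "ggthemes", "scales", "gridExtra")

to_install <- missing_packages(required)

# Uncomment to install whatever is missing (needs network access):
# if (length(to_install) > 0) install.packages(to_install)
```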

1.2 Required libraries

library(tidyverse)
library(forcats)
library(kableExtra)
library(here)
library(stringr)
library(googlesheets4)
library(likert)
library(magick)
library(patchwork)
library(ggthemes)
library(scales)
library(gridExtra)

1.3 Reproduce paper

To create the PDF of the computational notebook you can run the following commands in a new R session. If you have problems rendering the PDF you can execute each chunk independently in RStudio.

# Load the packages needed for rendering
library(knitr)
library(rmarkdown)
# Render the notebook to PDF
rmarkdown::render("self-assessment-experiment.Rmd", output_format = "pdf_document")

2 Data & methods

2.1 Participants

Eligible participants were initially all students enrolled in the Master Thesis Project in the academic year 2019/2020 from two GIS-related master programmes: Erasmus Mundus MSc in Geospatial Technologies and MSc Geomatics. The former is offered by three universities (UJI, NOVA IMS Lisboa and Münster), the latter by TU Delft. As participation was voluntary, the experiment was not counted towards the final grade of the master’s thesis.

Students who answered the questionnaires but were not enrolled in the master programmes above (e.g., PhD students who participated in a doctoral course at UJI) were removed from the analysis. Fields that contain personal information (email, thesis handlers, URLs to PDF files) were also removed and therefore not used in the analysis.
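The filtering described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the code used to produce the filtered files: the data frame, its column names (`programme`, `email`, `thesis_url`, `importance`) and its values are all hypothetical.

```r
# Hypothetical raw responses; column names and values are illustrative only.
raw <- data.frame(
  programme  = c("MSc Geospatial Technologies", "MSc Geomatics", "PhD course"),
  email      = c("a@example.org", "b@example.org", "c@example.org"),
  thesis_url = c("https://example.org/1.pdf", NA, NA),
  importance = c(5, 4, 3),
  stringsAsFactors = FALSE
)

eligible <- c("MSc Geospatial Technologies", "MSc Geomatics")
personal <- c("email", "thesis_url")

# Keep only eligible respondents, then drop the personal-information fields.
filtered <- raw[raw$programme %in% eligible,
                setdiff(names(raw), personal),
                drop = FALSE]
```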

2.2 Methodological steps

The steps followed are briefly summarised below. Additional notes are also available.

  • Step 1. Participants answered the first questionnaire (pre-test assessment) at the start of the semester (or master thesis project/course).

  • Step 2. On the OSF website, participants were provided with a 5-minute introductory video lecture on the initiative and a 20-minute video lecture on the topic of reproducibility.

  • Step 3. On the same website, participants were also given 3 self-study, self-paced assignments (in PDF format) to introduce them to the methods and tools for reproducible research. To support them, we also set up a GitHub repo as a discussion service where all students could discuss questions about the assignments and the reproducibility self-assessment of their theses.

  • Step 4. Participants were asked to self-assess the level of reproducibility of their master theses, as explained in the first video lecture. In practice, this meant adding a short statement (as the last line) to the abstract indicating the level (score) for each of the five criteria (Nüst et al. 2018).

  • Step 5. At the end of the semester (or master thesis project/course), participants answered a second questionnaire (post-test assessment).

2.3 Questionnaires

The questions for each questionnaire were discussed and agreed upon in a shared document.

Questionnaire #1 (pre-test) is divided into three sections:

  1. Previous knowledge/experience on reproducible research,
  2. Current working practices, and
  3. Previous education background/work experience/etc.

Questionnaire #2 (post-test) is divided into two sections:

  1. Knowledge/experience acquired on reproducible research compared to what was stated in questionnaire #1, and
  2. Perceived importance of reproducibility in their future careers.

Section 1 is essentially the same in both questionnaires, as the same list of terms (18 in the result tables of Section 3) is asked at the start and at the end of the semester.

2.5 Reproducibility self-assessment data

The master’s theses of the students of the Erasmus Mundus MSc in Geospatial Technologies are openly available in an institutional repository. Most of the participating students added the self-assessment sentence in the specified format.

The master’s theses of the students of TU Delft’s MSc Geomatics are openly available in an institutional repository. Here, none of the students added the self-assessment sentence as requested. However, some of them added an appendix to their theses to reflect on the reproducibility of the thesis in a narrative way.

3 Results

The following plots and tables are based on the files pretest_filtered.csv and postest_filtered.csv in the folder data. Only students from MSc in Geospatial Technologies (UJI, NOVA IMS, IFGI-WWU) and MSc Geomatics (TU Delft) are analysed.

3.1 MSc in Geospatial Technologies (UJI, NOVA IMS, IFGI-WWU)

3.1.1 Questionnaire #1 (pre-test)

  • 25 students did eventually deliver the master thesis project.

  • 48% (12/25) of students answered the first questionnaire. All but one added the self-assessment statement to the thesis manuscript.

  • 52% (13/25) of students added a self-assessment statement to the master thesis. Two of them did not answer questionnaire #1, so they are not included in the analysis.

3.1.1.1 Section 1: Previous knowledge/experience on reproducible research

2 (out of 12) students answered that they had previously been trained in reproducible research practices.

Imagine that you are going to explain the following terms to somebody else. How well do you KNOW them?

Percentage of answers in the disagree (low), agree (high) or neutral categories. Mean response with standard deviation using numeric values 1 to 5 for ‘Strongly disagree’ to ‘Strongly agree.’ Terms are ordered by ‘high’ category.
School of thought | Term | low (%) | neutral (%) | high (%) | mean | sd
democratic | Open source | 8.33 | 0.00 | 91.67 | 4.58 | 0.90
democratic | Open data | 16.67 | 0.00 | 83.33 | 4.25 | 1.14
democratic | Open access | 16.67 | 8.33 | 75.00 | 4.00 | 1.13
democratic | License | 0.00 | 33.33 | 66.67 | 4.17 | 0.94
infrastructure | Data repositories | 33.33 | 0.00 | 66.67 | 3.50 | 1.57
democratic | Intellectual property rights | 8.33 | 33.33 | 58.33 | 3.67 | 0.89
democratic | Data/code versioning | 16.67 | 25.00 | 58.33 | 3.58 | 1.00
pragmatic | Execution environments | 33.33 | 16.67 | 50.00 | 3.00 | 1.21
infrastructure | Code repositories | 33.33 | 16.67 | 50.00 | 3.42 | 1.44
public | Citizen science | 25.00 | 33.33 | 41.67 | 2.92 | 1.24
pragmatic | Digital notebooks | 33.33 | 25.00 | 41.67 | 3.17 | 1.27
pragmatic | Reproducible packages | 50.00 | 8.33 | 41.67 | 2.83 | 1.34
infrastructure | Collaborative coding repositories | 41.67 | 25.00 | 33.33 | 2.92 | 1.38
public | Science dissemination | 33.33 | 33.33 | 33.33 | 2.75 | 1.22
pragmatic | Analytical workflows | 58.33 | 16.67 | 25.00 | 2.42 | 1.16
infrastructure | Containers platforms | 41.67 | 33.33 | 25.00 | 2.67 | 1.30
public | Science blogging | 25.00 | 50.00 | 25.00 | 2.92 | 1.16
pragmatic | Computational essays | 41.67 | 41.67 | 16.67 | 2.58 | 1.24

The table above contains the raw data used to create the Likert-scale plot in the next figure. Students had previous knowledge (yellowish answers >= 50%) of the top 9 terms: Open source, Open data, Open access, License, Data repositories, Intellectual property rights, Data/code versioning, Execution environments and Code repositories. The next term, Citizen science, is known to some extent, as one third of students opted for the neutral response. The remaining terms (Digital notebooks, Reproducible packages, Collaborative coding repositories, Science dissemination, Analytical workflows, Containers platforms, Science blogging and Computational essays) are in general poorly known and/or understood. This suggests that terms closely connected to reproducibility (mostly from the pragmatic school of thought; see Fecher and Friesike 2014) were generally unfamiliar to students (reddish answers predominate).
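To make the table columns concrete, here is a minimal base R sketch of how such a summary row can be derived from 1 to 5 responses. It assumes that 'low' groups values 1 and 2, 'high' groups values 4 and 5, and that sd is the sample standard deviation. The response vector is hypothetical: it was chosen to be consistent with the 'Open source' row above (N = 12) and is not the actual survey data, and `summarise_likert()` is an illustrative helper, not a function from the likert package.

```r
# Hypothetical responses for one term (N = 12), chosen to be consistent
# with the 'Open source' row above; these are not the actual survey data.
responses <- c(2, 4, 4, rep(5, 9))

# Illustrative helper: percentage of low/neutral/high answers plus mean and sd.
summarise_likert <- function(x) {
  n <- length(x)
  c(
    low     = round(100 * sum(x <= 2) / n, 2),  # values 1 and 2
    neutral = round(100 * sum(x == 3) / n, 2),  # value 3
    high    = round(100 * sum(x >= 4) / n, 2),  # values 4 and 5
    mean    = round(mean(x), 2),
    sd      = round(sd(x), 2)                   # sample standard deviation
  )
}

summarise_likert(responses)
# low = 8.33, neutral = 0, high = 91.67, mean = 4.58, sd = 0.9
```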

Questions related to previous experience in reproducibility, main difficulties and perceived importance

Previous experience in reproducibility, main difficulties and perceived importance, using numeric values 1 to 5 for ‘No important at all’ to ‘Really important.’
Have you ever experienced DIFFICULTIES IN REUSING somebody else’s code / data? | Have you ever had to REUSE YOUR OWN past code/data? | If you answered YES in either of the last two questions, please explain which were the MAIN DIFFICULTIES you experienced. If NO, skip question | Do you know where to look for HELP and extra information to make your research reproducible? | Please rate the perceived IMPORTANCE of doing your research in a reproducible way | According to you, why is it IMPORTANT (or not) to do your research in a reproducible way?
YES | YES | Recalling the work flow of code was difficult especially when i used multiple classes and libraries. Code comments do help but only for limited code. Proper documentation was missing for my codes. | no | 3 | Reproducibility is good in most cases but there are some research projects that should be not so open due to confidentiality and also to protect interests of researchers. If the research result is a kind of product then openness and reproducibility will give no benifit to the researcher.
YES | YES | To get the idea of what is done in each step of the workflow. | A bit. | 4 | If the work is reproducible, the value is significantly higher as it can be used without much more work by others. This enables improvements of the idea and process.
YES | YES | Problem of using static code in dynamic code. | No. | 5 | To pursue with research works and participate in conferences.
NO | NO |  | No, i dont know | 5 | Ability to obtain similar results on research objective or questions independent on the study or even experiment
YES | YES | Most time consuming part is understanding others code if they are not properly commented. Even in my code sometimes my own comments are tough to recall later. | No | 5 | Re-usability is one of the key step to move forward research words. Ease to understand other words in a scientific manner will help definitely.
YES | YES | To underestand the flow an the logic of the code | No | 4 | A research that can not be reproducible, did not have value
YES | YES | Code was structured to a certain tool and tool was not open. Data was not available in later time. | Only internet searches. | 5 | Reproducibility is essential because the scientific results are affected by data structure, data content, processing method and presentation methods. There are more variables and reproducibility enhances proper understanding and control of research.
YES | NO |  | Not really | 5 | It would definitely be helpful, If I or someone else would like to replicate or continue this work in different area
YES | YES | Documentation | No | 4 | It is important so as to verify the scientific workflow but not always possible due to restrictions in data.
YES | YES | Lack of README | Not really | 5 | not reproducible, means useless, and we don’t know what’s wrong if we run it again
YES | YES | not well documented and function parameter some time confused me even if I use my past codes | NO | 4 | it can easy and fast the productivity
YES | YES | The data itself and the pre-processing of the data before the analysis. | Haven’t really tried, but I try to keep all well documented. | 5 | Because so many people is doing the same thing over and over, and for a person that is learning is better to just help to understand what you did.

The plots below refer to the questions above with numeric/logical answers.

A large majority of students encountered difficulties when trying to reuse their own code or someone else’s code (A and B). The third column in the table above shows the problems the students faced. In general, students considered reproducible research practices important (C). Because students had already tried to reuse others’ work, reproducibility practices seem to be key for them; however, they face barriers to applying these practices.

3.1.1.2 Section 2: Current working practices

Note that responses to ‘working practices’ were multi-choice (N=12).
Practice group | Survey question
analysis | What tools do you plan to use to ANALYSE data?
visualisation | What tools do you plan to use to VISUALISE/PLOT data?
writing | What tools do you plan to WRITE UP your master thesis or conference/journal article?
workflow | What is your (expected) process of GETTING the summary data, statistical results, figures, maps and tables IN your master thesis document (or conference/journal article)?
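Because the working-practice questions were multi-choice, a single respondent can contribute several options to one question. The sketch below shows one way to tally such answers in base R. It assumes that each response is stored as a single comma-separated string, which is not confirmed by this document, and the tool names are hypothetical.

```r
# Hypothetical multi-choice answers, one string per respondent, with the
# selected options separated by commas; tool names are illustrative only.
answers <- c("R, QGIS", "Python", "Python, QGIS, R", "QGIS")

# Split each answer into individual options and trim surrounding whitespace.
choices <- trimws(unlist(strsplit(answers, ",", fixed = TRUE)))

# Count how many respondents selected each option, most frequent first.
counts <- sort(table(choices), decreasing = TRUE)

# Share of respondents (not of selections) choosing each option, in percent.
share <- round(100 * counts / length(answers), 1)
```

Dividing by the number of respondents rather than the number of selections means the shares can add up to more than 100%, which is the expected behaviour for multi-choice questions.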

3.1.1.3 Section 3: Previous education background / work experience /…

Previous education background and work experience, and prospects for future.
What BACHELOR DEGREE (or equivalent) did you have when applying for the master programme? | How many years of PROFESSIONAL EXPERIENCE did you have when applying for the master programme? | In which context are you DEVELOPING your master thesis? | Please select which sentence better describes the PLAN you have after the completion of your Master thesis
Computational Physics | Over 2 years | At the university | Not sure yet
Bachelor of engineering | None | At the university | Not sure yet
Information Technology(Engineering) | Up to 2 years | At the university | Continue with doctoral studies (or another master degree)
BCS in Computer science | Over 2 years | At the university | Continuing in my previous job (teaching GIS in university)
Bachelor in Computer Science and Engineering. | Over 2 years | At the university | Find a job in academia (researcher, technician, etc) or in education-related institutions (teacher, etc)
Geography | None | As internship in industry | Find a job in academia (researcher, technician, etc) or in education-related institutions (teacher, etc)
Bachelor in Geomatics Engineering | Over 2 years | At the university | Find a job in government agencies or institutions
Bachelor in Geomatics Engineering | Over 2 years | At the university | Have to continue previous job (Government ) for some period
Urban and Regional Planning | None | At the university | Continue with doctoral studies (or another master degree)
Computer Science | Over 2 years | At the university | Find a job in industry (or set up own company)
Civil engineering | Over 2 years | At the university | Continue with doctoral studies (or another master degree)
Natural Renewable Resource Engeneering | Up to 2 years | At the university | Find a job in industry (or set up own company)

3.1.2 Questionnaire #2 (post-test)

28% (7/25) of students answered the second questionnaire. All added the self-assessment statement to the thesis manuscript.

3.1.2.1 Section 1: Previous knowledge/experience on reproducible research

Imagine that you are going to explain the following terms to somebody else. How well do you KNOW them?

Percentage of answers in the disagree (low), agree (high) or neutral categories. Mean response with standard deviation using numeric values 1 to 5 for ‘Strongly disagree’ to ‘Strongly agree.’ Terms are ordered by ‘high’ category.
School of thought | Term | low (%) | neutral (%) | high (%) | mean | sd
democratic | Open access | 0.00 | 0.00 | 100.00 | 4.86 | 0.38
democratic | Open data | 0.00 | 14.29 | 85.71 | 4.57 | 0.79
infrastructure | Code repositories | 0.00 | 14.29 | 85.71 | 4.57 | 0.79
democratic | Open source | 0.00 | 14.29 | 85.71 | 4.71 | 0.76
pragmatic | Reproducible packages | 0.00 | 14.29 | 85.71 | 4.29 | 0.76
infrastructure | Data repositories | 0.00 | 14.29 | 85.71 | 4.43 | 0.79
pragmatic | Analytical workflows | 28.57 | 0.00 | 71.43 | 3.71 | 1.25
infrastructure | Collaborative coding repositories | 28.57 | 0.00 | 71.43 | 3.71 | 1.25
democratic | Intellectual property rights | 0.00 | 42.86 | 57.14 | 3.71 | 0.76
pragmatic | Digital notebooks | 42.86 | 0.00 | 57.14 | 3.43 | 1.40
pragmatic | Execution environments | 14.29 | 28.57 | 57.14 | 3.86 | 1.21
public | Science blogging | 28.57 | 14.29 | 57.14 | 3.43 | 1.13
public | Science dissemination | 28.57 | 14.29 | 57.14 | 3.14 | 1.57
democratic | Data/code versioning | 0.00 | 57.14 | 42.86 | 3.71 | 0.95
public | Citizen science | 42.86 | 14.29 | 42.86 | 3.00 | 1.41
democratic | License | 14.29 | 57.14 | 28.57 | 3.43 | 1.13
pragmatic | Computational essays | 42.86 | 28.57 | 28.57 | 3.00 | 1.53
infrastructure | Containers platforms | 42.86 | 28.57 | 28.57 | 2.71 | 1.50

The table above contains the raw data used to create the Likert-scale plot in the next figure.

Have you read/watched the available materials (slides, videos, additional papers, etc)?

3.1.2.2 Section 2: Perceived importance of reproducibility in their future careers

Perceived importance of reproducibility in the future, using numeric values 1 to 5 for ‘No important at all’ to ‘Really important.’
Are you going to use/adapt reproducible research practice(s) in your current and/or future research projects (master thesis, doctoral thesis, etc.)? | Are you planning to learn more about reproducible research on your own? | How important are reproducibility practices for your future professional career (academia, industry, government, etc.)?
Not now, but I will definitively use/adapt them in the future | 4 | 4
Yes, I am going to use/adapt them from now | 4 | 5
Yes, I am going to use/adapt them from now | 5 | 5
Not now, but I will definitively use/adapt them in the future | 5 | 4
Yes, I am going to use/adapt them from now | 4 | 4
Yes, I am going to use/adapt them from now | 5 | 5
Not now, but maybe I will explore them in the future | 2 | 4

The plots below refer to the questions above.

3.1.3 Side-by-side comparison

Imagine that you are going to explain the following terms to somebody else. How well do you KNOW them?

Future work: test statistically for change between the pre-test and post-test Likert questions using the Wilcoxon signed-rank test.
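As a sketch of what this future analysis could look like, the snippet below runs a paired Wilcoxon signed-rank test with base R's `wilcox.test()`. The ratings are hypothetical and only illustrate the call. The test requires that each pre-test answer can be matched to the post-test answer of the same student, which presumes an identifier linking the two questionnaires; that is an assumption here.

```r
# Hypothetical paired 1-5 ratings for one term from the same seven students
# (pre-test vs post-test); these values are illustrative, not survey data.
pre  <- c(2, 3, 3, 1, 4, 2, 3)
post <- c(4, 4, 5, 3, 4, 4, 5)

# Paired (signed-rank) test. exact = FALSE requests the normal approximation,
# because ties and zero differences, common in Likert data, rule out an
# exact p-value.
result <- wilcox.test(post, pre, paired = TRUE, exact = FALSE)

result$statistic  # V: sum of the ranks of the positive differences
result$p.value
```

With only a handful of matched respondents, the power of such a test is low, so any result would need to be read with caution.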

3.1.4 Results of self-assessment reproducibility

Example of self-assessment statement included in a thesis abstract

The distribution of the reproducibility levels by criterion contrasts notably with the results obtained in the evaluation of the AGILE/GIScience papers (Nüst et al. 2018). Surprisingly, all the criteria reach level 3, the highest (ideal) reproducibility level. The NA level count (1 per criterion) corresponds to the student who answered the questionnaire but did not do the self-assessment. NAs are therefore not relevant for the analysis; none of the students who did the self-assessment assigned NA values to the criteria. Indeed, the lowest score for any criterion was 1, which sets the bar quite high. Without a reproducibility check by a third person, these self-assessment values seem quite inflated, well above the level we found in academic publications. This can be interpreted in at least two ways: either the master theses were of outstanding quality with regard to reproducibility or, the more credible interpretation, students did not understand the reproducibility levels and criteria well and failed to do an objective self-evaluation.
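The distribution discussed above can be tabulated with a few lines of base R. The sketch below is illustrative only: the scores are hypothetical, and the criterion names are assumed from Nüst et al. (2018) rather than taken from this document. NA is kept as its own category so that the respondent without a self-assessment remains visible in the counts.

```r
# Hypothetical self-assessed levels (0-3, NA allowed) for four theses.
# Criterion names are assumed from Nüst et al. (2018); values are illustrative.
scores <- data.frame(
  input_data    = c(3, 2, 1, NA),
  preprocessing = c(2, 2, 1, NA),
  methods       = c(3, 1, 2, NA),
  environment   = c(1, 1, 2, NA),
  results       = c(3, 2, 2, NA)
)

# Count theses per level for each criterion, keeping NA as its own category.
# Rows of the resulting matrix are levels 0, 1, 2, 3 and NA; columns are criteria.
level_counts <- sapply(scores, function(x) {
  table(factor(x, levels = 0:3), useNA = "always")
})

level_counts
```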

3.2 MSc Geomatics (TU Delft)

3.2.1 Questionnaire #1 (pre-test)

  • 30 students did eventually deliver the master thesis project.

  • 47% (14/30) of students answered the first questionnaire.

3.2.1.1 Section 1: Previous knowledge/experience on reproducible research

1 (out of 14) student answered that they had previously been trained in reproducible research practices.

Imagine that you are going to explain the following terms to somebody else. How well do you KNOW them?

Percentage of answers in the disagree (low), agree (high) or neutral categories. Mean response with standard deviation using numeric values 1 to 5 for ‘Strongly disagree’ to ‘Strongly agree.’ Terms are ordered by ‘high’ category.
School of thought | Term | low (%) | neutral (%) | high (%) | mean | sd
infrastructure | Code repositories | 0.00 | 7.14 | 92.86 | 4.64 | 0.63
democratic | Open access | 0.00 | 14.29 | 85.71 | 4.50 | 0.76
democratic | Open source | 0.00 | 14.29 | 85.71 | 4.50 | 0.76
infrastructure | Collaborative coding repositories | 14.29 | 0.00 | 85.71 | 4.07 | 1.21
infrastructure | Data repositories | 0.00 | 14.29 | 85.71 | 4.29 | 0.73
democratic | Open data | 0.00 | 21.43 | 78.57 | 4.36 | 0.84
democratic | Data/code versioning | 0.00 | 21.43 | 78.57 | 4.36 | 0.84
democratic | License | 0.00 | 28.57 | 71.43 | 4.00 | 0.78
democratic | Intellectual property rights | 21.43 | 28.57 | 50.00 | 3.50 | 1.45
pragmatic | Analytical workflows | 35.71 | 28.57 | 35.71 | 3.07 | 1.38
pragmatic | Digital notebooks | 50.00 | 28.57 | 21.43 | 2.64 | 1.50
pragmatic | Execution environments | 42.86 | 35.71 | 21.43 | 2.64 | 1.39
public | Science blogging | 57.14 | 21.43 | 21.43 | 2.36 | 1.34
infrastructure | Containers platforms | 57.14 | 28.57 | 14.29 | 2.14 | 1.17
public | Citizen science | 57.14 | 28.57 | 14.29 | 2.14 | 1.17
public | Science dissemination | 71.43 | 14.29 | 14.29 | 2.14 | 1.23
pragmatic | Reproducible packages | 57.14 | 35.71 | 7.14 | 2.14 | 1.23
pragmatic | Computational essays | 71.43 | 28.57 | 0.00 | 1.64 | 0.93

The table above contains the raw data used to create the Likert-scale plot in the next figure.

Questions related to previous experience in reproducibility, main difficulties and perceived importance

Previous experience in reproducibility, main difficulties and perceived importance, using numeric values 1 to 5 for ‘No important at all’ to ‘Really important.’
Have you ever experienced DIFFICULTIES IN REUSING somebody else’s code / data? | Have you ever had to REUSE YOUR OWN past code/data? | If you answered YES in either of the last two questions, please explain which were the MAIN DIFFICULTIES you experienced. If NO, skip question | Do you know where to look for HELP and extra information to make your research reproducible? | Please rate the perceived IMPORTANCE of doing your research in a reproducible way | According to you, why is it IMPORTANT (or not) to do your research in a reproducible way?
YES | YES | Lack of commenting and other variable names than I would choose made re-use difficult for me. Sometimes you do not know what kind of thing a function returns, is it a list or a dictionary, this influences how you use the output. | Yes, now that we have this course I can look it up here. | 3 | I think it is important to some degree that when it matters it can be verified if the results of a research are correct, however, I dont think you should over do it or at least keep this separate from the actual work because it might cause information overload. I can also imagine that if you do research with users and actual people, reproducibility might be more inconsistent and can even lead to privacy issues.
YES | YES | the environment of my system is often different from the original code | NO | 4 | That it can be used by other researchers as well
YES | YES | Dependency problems, lack of comments in the code etc | No | 5 | So the results can be validated by others
YES | YES | Hard to configure the execution environments with a C++ project in GitHub. | No. | 5 | Avoid re-invent of the wheels and others can continue with your work.
YES | YES | difficult to find out what the required input formats are for code to work and difficult to see which hardcoded elements I need to change | NO | 4 | It is important so that if I made mistakes in my research people can see this instead of that they just believe that all my conclusions are true.
YES | YES | The main problem often is to make the data/code suitable for your problem. With geographic data I experienced problems with the coordinate reference systems used (sometimes not well documented, reprojection difficult). With own code I’ve not really experienced big problems. | I have never looked this up before. I would look things up on Google. | 4 | It is extremely useful if other people can use your research and if it’s well documented. If other people can understand and use your research, they might even find ‘problems’ or improvements, which will benefit the research even more.
YES | YES | No sufficient documentation on what the code actually does/means | Documentation/Github | 5 | Validity of the research can be tested
YES | YES | understanding and finding for which case, datasets | no | 5 | We speak of knowledge when methods are always leading to the same results under the same circumstances. This could only be seen if the research is reproducible.
YES | YES | understanding and finding for which case, datasets | no | 5 | We speak of knowledge when methods are always leading to the same results under the same circumstances. This could only be seen if the research is reproducible.
YES | YES | To reuse someone’s code I need to understand it and then check the possibility to use it (which license used to publish it). | No | 4 | Important: to have defined clearly your topic, then to describe the methodology and the implemented algorithms and present and clarify your results
YES | YES | To reuse someone’s code I need to understand it and then check the possibility to use it (which license used to publish it). | No | 4 | Important: to have defined clearly your topic, then to describe the methodology and the implemented algorithms and present and clarify your results
NO | YES | Too messy/unstructured. Comments were generally fine. | No. | 5 | It can proof that your research has been conducted in a proper way. Also, others can note mistakes. Or it can be useful for further research.
YES | YES | 1. Installation on different hardware and cuda requirements 2. Code is difficult to debug because of lack of structure | Normally I look on google whenever i need any information about better structuring my code and mainly follow github style for major codes. But not explicitly | 4 | I would say it is important to do research in reproducible way to get feedback from community and improve the current status by additional feedback. But also it happens that it is difficult to do somethings in reproducible manner due to shortage of time. But I think the way Geomatics taught us coding and research is good way to reproduce because I was able to use my previous code lot of times without problems.
YES | YES | The biggest issue of reusing code of others is sometimes because of the version difference of used software and libraries, it’s quite hard to configure the environment correctly. | yes | 5 | Because doing research in a reproducible way can not only help others who work on similar topics, but also help the future work of our own.

The plots below refer to the questions above with numeric/logical answers.

3.2.1.2 Section 2: Current working practices

Note that responses to ‘working practices’ were multi-choice (N=14).
Practice group | Survey question
analysis | What tools do you plan to use to ANALYSE data?
visualisation | What tools do you plan to use to VISUALISE/PLOT data?
writing | What tools do you plan to WRITE UP your master thesis or conference/journal article?
workflow | What is your (expected) process of GETTING the summary data, statistical results, figures, maps and tables IN your master thesis document (or conference/journal article)?

3.2.1.3 Section 3: Previous education background / work experience /…

Previous education background and work experience, and prospects for future.
What BACHELOR DEGREE (or equivalent) did you have when applying for the master programme? | How many years of PROFESSIONAL EXPERIENCE did you have when applying for the master programme? | In which context are you DEVELOPING your master thesis? | Please select which sentence better describes the PLAN you have after the completion of your Master thesis
Maritime Engineering | None | At the university | Find a job in industry (or set up own company)
two bachelor degrees: (IT-Engineering & Business Consulting) & Geography | Up to 1 year | As internship in industry | Continue with doctoral studies (or another master degree)
M. Eng. Rural and Surveying Engineering | Up to 1 year | As internship in industry | Not sure yet
Geomatics | None | At the university | Continue with doctoral studies (or another master degree)
Future Planet Studies | Up to 6 months | As internship in industry | Find a job in industry (or set up own company)
Computer Science | Up to 1 year | At the university | Not sure yet
Architecture for the Built Enviornment | None | As internship in industry | Not sure yet
earth and economics | None | As internship in industry | Find a job in government agencies or institutions
earth and economics | None | As internship in industry | Find a job in government agencies or institutions
Integrated Master of Rural and Surveying Engineering | Up to 1 year | As internship in industry | Either in institutions or in industry
Integrated Master of Rural and Surveying Engineering | Up to 1 year | As internship in industry | Either in institutions or in industry
Human Geography | None | At the university | Rest (take a gap year or similar)
Agriculture and Food Engineering | Over 2 years | At the university | Continue with doctoral studies (or another master degree)
Remote Sensing | Over 2 years | At the university | Find a job in industry (or set up own company)

3.2.2 Questionnaire #2 (post-test)

10% (3/30) of students answered the second questionnaire.

3.2.2.1 Section 1: Previous knowledge/experience on reproducible research

Imagine that you are going to explain the following terms to somebody else. How well do you KNOW them?

Percentage of answers in the disagree (low), agree (high) or neutral categories. Mean response with standard deviation using numeric values 1 to 5 for ‘Strongly disagree’ to ‘Strongly agree.’ Terms are ordered by ‘high’ category.
School of thought | Term | low (%) | neutral (%) | high (%) | mean | sd
pragmatic | Execution environments | 0.00 | 0.00 | 100.00 | 4.00 | 0.00
public | Science blogging | 0.00 | 0.00 | 100.00 | 4.00 | 0.00
democratic | Open access | 0.00 | 0.00 | 100.00 | 4.33 | 0.58
democratic | Open data | 0.00 | 0.00 | 100.00 | 4.67 | 0.58
democratic | Open source | 0.00 | 0.00 | 100.00 | 4.67 | 0.58
democratic | Data/code versioning | 0.00 | 0.00 | 100.00 | 4.67 | 0.58
democratic | License | 0.00 | 0.00 | 100.00 | 4.33 | 0.58
infrastructure | Collaborative coding repositories | 0.00 | 0.00 | 100.00 | 4.33 | 0.58
infrastructure | Code repositories | 0.00 | 0.00 | 100.00 | 4.67 | 0.58
infrastructure | Data repositories | 0.00 | 0.00 | 100.00 | 4.67 | 0.58
democratic | Intellectual property rights | 0.00 | 33.33 | 66.67 | 4.00 | 1.00
pragmatic | Computational essays | 66.67 | 0.00 | 33.33 | 2.33 | 1.53
public | Citizen science | 33.33 | 33.33 | 33.33 | 3.00 | 1.00
pragmatic | Digital notebooks | 33.33 | 66.67 | 0.00 | 2.67 | 0.58
pragmatic | Analytical workflows | 33.33 | 66.67 | 0.00 | 2.67 | 0.58
pragmatic | Reproducible packages | 33.33 | 66.67 | 0.00 | 2.67 | 0.58
infrastructure | Containers platforms | 33.33 | 66.67 | 0.00 | 2.33 | 1.15
public | Science dissemination | 0.00 | 100.00 | 0.00 | 3.00 | 0.00

The table above contains the raw data used to create the Likert-scale plot in the next figure.

Have you read/watched the available materials (slides, videos, additional papers, etc)?

3.2.2.2 Section 2: Perceived importance of reproducibility in their future careers

Perceived importance of reproducibility in the future, using numeric values 1 to 5 for ‘No important at all’ to ‘Really important.’
Are you going to use/adapt reproducible research practice(s) in your current and/or future research projects (master thesis, doctoral thesis, etc.)? | Are you planning to learn more about reproducible research on your own? | How important are reproducibility practices for your future professional career (academia, industry, government, etc.)?
Yes, I am going to use/adapt them from now | 2 | 3
Yes, I am going to use/adapt them from now | 4 | 5
Not now, but maybe I will explore them in the future | 3 | 4

The plots below refer to the questions above.

3.2.3 Side-by-side comparison

Imagine that you are going to explain the following terms to somebody else. How well do you KNOW them?

It does not make sense to test statistically for change between the pre-test and post-test Likert questions because of the large difference in the number of responses per questionnaire: 14 and 3, respectively.

3.2.4 Results of self-assessment reproducibility

As said previously, TU Delft students did not provide the self-assessment statement. Yet, some of them added a 1-page reflection to their thesis manuscripts. TODO: to be commented

4 Concluding remarks

  • Observed an awareness of reproducibility in all theses that included the self-assessment.
  • Meta-assessment: students’ self-assessment was generally too optimistic.
  • Can self-study & self-assessment influence the degree of reproducibility?
  • Turn into a long-term study on the impact of MSc-level exposure to reproducibility?
  • New requirements on MSc Geomatics

References

Fecher, Benedikt, and Sascha Friesike. 2014. “Open Science: One Term, Five Schools of Thought.” In Opening Science, 17–47. Springer, Cham. https://doi.org/10.1007/978-3-319-00026-8_2.
Granell, Carlos, Rusne Sileryte, and Daniel Nüst. 2020. “Reproducible Graduate Theses in GIScience.” In Abstract Proceedings of Research Reproducibility 2020. https://pwd.aa.ufl.edu/researchre-pro/wp-content/uploads/sites/8/2020/11/Paper_2-1_Granell_Carlos.pdf.
Nüst, D, C Granell, B Hofer, M Konkol, FO Ostermann, R Sileryte, and V Cerutti. 2018. “Reproducible Research and GIScience: An Evaluation Using AGILE Conference Papers.” PeerJ 6: e5072. https://doi.org/10.7717/peerj.5072.